Skip to content

Improve performance of getLinkedOps #58

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 7, 2018

Conversation

alecgibson
Copy link
Contributor

@alecgibson alecgibson commented Jun 6, 2018

Running getOps to fetch a large number of operations is not very
performant.

Anecdotally, performing this operation on a local development machine
with a document comprising of ~200,000 operations takes ~20s.

The largest slow-down appears to come from the getLinkedOps method,
and in particular the use of Array.unshift.

This change makes a very simple change that pushes to the array
instead of unshift, and then reverses it at the end.

The anecdotal operation time using this modified function is now ~2s,
which is an improvement of an order of magnitude.

@coveralls
Copy link

coveralls commented Jun 6, 2018

Coverage Status

Coverage remained the same at 93.033% when pulling f5ff21e on alecgibson:linked-op-performance-boost into 5a7c674 on share:master.

Copy link
Contributor

@gkubisa gkubisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is good but the new code could produce incorrect results in some cases.

Have you considered push + reverse as an alternative? It should also be pretty fast and probably a bit simpler.

index.js Outdated

if (link.equals ? !link.equals(op._id) : link !== op._id) {
linkedOps.splice(i, 1);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not guaranteed that the next op's id would be equal to link, so this code could produce incorrect results in some cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know which cases exactly? Can we add test coverage?

Copy link
Contributor

@gkubisa gkubisa Jul 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, when the database returns 3 or more operations with the same version. It's extremely unlikely but possible. Operations with duplicate versions might be created when a ShareDB server stores an op but fails to store a corresponding snapshot, eg it crashes.

Actually, when I look at it again I think there are more issues here... i is an index in ops but you use it to remove items from linkedOps. Every time you remove from linkedOps, all following ops are re-indexed, so the next removal would be at a wrong index.

@alecgibson
Copy link
Contributor Author

@gkubisa I've done as you suggested, and just replaced with push/reverse. You're right; it's still just as fast. I really think this change is worthwhile. The getSnapshot change I'm trying to get merged on ShareDb is almost worthless without this change (especially when trying to work with 1,000,000+ ops!)

Running `getOps` to fetch a large number of operations is not very
performant.

Anecdotally, performing this operation on a local development machine
with a document comprising of ~200,000 operations takes ~20s.

The largest slow-down appears to come from the `getLinkedOps` method,
and in particular the use of `Array.unshift`.

This change makes a very simple change that `push`es to the array
instead of `unshift`, and then `reverse`s it at the end.

The anecdotal operation time using this modified function is now ~2s,
which is an improvement of an order of magnitude.
@alecgibson alecgibson force-pushed the linked-op-performance-boost branch from 9a459eb to 66f6533 Compare July 12, 2018 16:18
@gkubisa
Copy link
Contributor

gkubisa commented Jul 13, 2018

Looks good to me. Thanks @alecgibson 👍

@alecgibson
Copy link
Contributor Author

@gkubisa do you know if anyone is still maintaining this repo? Or am I just going to have to fork?

@gkubisa
Copy link
Contributor

gkubisa commented Jul 13, 2018

I maintain https://github.com/teamwork/sharedb-mongo and will merge this PR today. PRs should also start getting merged into this repo soon, hopefully after the next review meeting on Wednesday.

@alecgibson
Copy link
Contributor Author

@gkubisa tangentially related - do you know if there's anything we can do about how flaky these tests are?! I'm assuming I need a green build to merge?

@gkubisa
Copy link
Contributor

gkubisa commented Jul 13, 2018

The build is ok at https://github.com/teamwork/sharedb-mongo thanks to @dcharbonnier:

Hopefully those changes will make it into the official repos soon.

@gkubisa
Copy link
Contributor

gkubisa commented Jul 13, 2018

@alecgibson I've just merged this branch into my fork and published to npm: https://www.npmjs.com/package/@teamwork/sharedb-mongo

Copy link
Contributor

@gkubisa gkubisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to merge 👍

@gkubisa
Copy link
Contributor

gkubisa commented Jul 19, 2018

@alecgibson You can get the tests passing by updating sharedb-mingo-memory to ^1.0.1. See c147fc2.

@alecgibson
Copy link
Contributor Author

alecgibson commented Jul 19, 2018

@gkubisa I've bumped the version, but there's possibly two more flaky tests? https://travis-ci.org/share/sharedb-mongo/jobs/405801814

@gkubisa
Copy link
Contributor

gkubisa commented Jul 19, 2018

Ok, that's for a very old version of node and MongoDB. I'd just update the .travis.yml file to test the supported node versions and fewer MongoDB versions like here: https://github.com/share/sharedb-mongo/pull/64/files#diff-354f30a63fb0907d4ad57269548329e3.

@ericyhwang this PR #64 updates all modules and tested node and MongoDB versions. Could you review and merge it when you have a moment, please?

@alecgibson
Copy link
Contributor Author

@gkubisa I've opened a new PR to deal with test flakiness separately: #65

@alecgibson alecgibson force-pushed the linked-op-performance-boost branch from f5ff21e to 66f6533 Compare July 19, 2018 13:39
Copy link
Contributor

@ericyhwang ericyhwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that all the test flakiness PRs have been merged, this is good to go. Thanks again for diagnosing the issue and fixing it!

I tested this with the latest sharedb-mongo master, and the tests all pass, so I'm going to go ahead and merge+publish to not keep the PR waiting around any longer.

@ericyhwang ericyhwang merged commit e92ba8b into share:master Aug 7, 2018
@ericyhwang
Copy link
Contributor

Published as [email protected]

@alecgibson alecgibson deleted the linked-op-performance-boost branch August 7, 2018 05:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants